Speech Interface Evaluation on Car Navigation System – Many Undesirable Utterances and Severe Noisy Speech –
Authors
Abstract
Recently, ASR (Automatic Speech Recognition) functions have been used commercially in various consumer applications, including car navigation systems. However, many technical and usability problems remain before ASR applications are ready for real business use. Our goal is to make ASR technologies fit for real business use. To do so, we first evaluate a car navigation interface that uses ASR as an input method, and then evaluate an ASR module using real, noisy in-car speech. For ASR applications, we envision mobile environments, e.g. mobile information service systems such as car navigation systems and cellular phones, on which an embedded speech recognizer (Kokubo et al., 2006) runs and which are connected to remote servers that support various information-seeking tasks. Among commercially available car navigation systems, over 75% currently have ASR interfaces; however, very few drivers have experience using them. What is the problem? The cause lies in the usability problems of ASR. In this chapter, we report the results of two experimental evaluations of ASR interfaces for mobile use, especially for car navigation applications. First, we evaluate the usability of the speech interface; second, we evaluate the problem of noisy in-car speech in order to propose an effective method for coping with it. For the first evaluation, we use a prototype with promising speech interfaces called FlexibleShortcuts and Select&Voice, produced by Waseda University (Nakano et al., 2007). We found many undesirable OOV (Out-Of-Vocabulary) utterances, which degrade the interface. Based on the second experiment, which examines car-noise problems, we propose combining an array microphone with the Spectrum Subtraction (SS) technique to increase recognition accuracy.
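As a rough illustration of the Spectrum Subtraction idea mentioned in the abstract, the following is a minimal single-channel sketch in Python with NumPy. It is not the chapter's implementation: the function name `spectral_subtraction`, the frame length, hop size, and spectral floor are all assumptions made for this example, and the array-microphone stage is not modeled. The sketch estimates an average noise magnitude spectrum from a noise-only recording, subtracts it from each frame of the noisy signal, and resynthesizes the result by overlap-add using the noisy phase.

```python
import numpy as np


def spectral_subtraction(noisy, noise, frame_len=256, hop=128, floor=0.01):
    """Basic magnitude spectral subtraction (illustrative sketch only).

    noisy: 1-D array with the noisy speech signal.
    noise: 1-D array with a noise-only recording (assumed to be available).
    frame_len, hop, floor: hypothetical settings chosen for this example.
    """
    window = np.hanning(frame_len)

    # Average noise magnitude spectrum over all noise-only frames.
    noise_mags = [
        np.abs(np.fft.rfft(noise[i:i + frame_len] * window))
        for i in range(0, len(noise) - frame_len + 1, hop)
    ]
    noise_mag = np.mean(noise_mags, axis=0)

    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i in range(0, len(noisy) - frame_len + 1, hop):
        spec = np.fft.rfft(noisy[i:i + frame_len] * window)
        mag = np.abs(spec)
        # Subtract the noise estimate; keep a small spectral floor so that
        # no bin becomes negative.
        clean_mag = np.maximum(mag - noise_mag, floor * mag)
        # Reuse the noisy phase for resynthesis.
        frame = np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)), n=frame_len)
        # Weighted overlap-add with the same window.
        out[i:i + frame_len] += frame * window
        norm[i:i + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)
```

In an in-car setting, the noise-only recording might come from segments where the driver is not speaking; how those segments are detected is outside the scope of this sketch.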
Similar resources
Linguistic and Acoustic Changes of U Different Dialogue S
This paper presents the characteristic differences of acoustic and linguistic features observed in different spoken dialogue situations: human-human vs. human-machine interactions. We compare the acoustic and linguistic features of the user’s speech to a spoken dialogue system and to a human-operator in several landmark setting tasks for a car navigation system. It has been pointed out that spe...
Linguistic and acoustic features depending on different situations - the experiments considering speech recognition rate
This paper presents the characteristic differences of linguistic and acoustic features observed in different spoken dialogue situations and with different dialogue partners: human-human vs. human-machine interactions. We compare the linguistic and acoustic features of the user’s speech to a spoken dialogue system and to a human operator in several goal setting and destination database searching...
Evaluation of a Noise Adaptive Speech Aurora 3 Datab
In this paper, we present evaluation results of a noise adaptive speech recognition system with combination of several techniques for robust speech recognition. The evaluation was on AURORA 3 database which contains noisy digit utterances collected in real car environments through close-talking and hands-free microphones. The techniques in the system include segmentation, maximum likelihood lin...
An Experimental Multimodal Command Control Interface for Car Navigation Systems
An experimental multimodal system combining natural input modes such as speech, lip movement, and gaze is proposed in this paper. It benefits from novel human-computer interaction (HCI) modalities and from multimodal integration for tackling the problem of the HCI bottleneck. This system allows the user to select menu items on the screen by employing speech recognition, lip reading, and gaze tr...
A Commercial Car Navigation System using Korean Large Vocabulary Automatic Speech Recognizer
In this paper, a Korean large vocabulary speech recognizer for an embedded car navigation device is introduced. The proposed speech recognizer identifies 450k points of interest within a resource-limited device without serious performance degradation under severe car-noise environments. Before launching the speech recognition application on the Korean retail market, a series of speech recogniti...